A real-time epilepsy seizure detection approach based on EEG using short-time Fourier transform and Google-Net convolutional neural network

Epilepsy is one of the most common brain disorders, and epileptic seizures have severe adverse effects on patients. Real-time epileptic seizure detection using electroencephalography (EEG) signals is an important research area aimed at improving the diagnosis and treatment of epilepsy. This paper proposes a real-time, EEG-based approach for detecting epileptic seizures using the short-time Fourier transform (STFT) and a Google-Net convolutional neural network (CNN). The method was evaluated on the CHB-MIT database and achieved 97.74 % accuracy, 98.90 % sensitivity, and a 1.94 % false positive rate. Additionally, the proposed method was implemented in real time using the sliding window technique. The processing time of the proposed method is just 0.02 s for every 2-s EEG episode, and the method achieved an average delay of 9.85 s at each seizure onset.


Introduction
Epilepsy is a neurological disorder characterized by abnormal synchronous electrical activity in the brain [1]. It affects a significant number of people worldwide, with an estimated 50 million individuals living with epilepsy [2]. Given the high prevalence of epilepsy and the impact of seizures on patients, there is a growing need to improve the efficiency of diagnosis and treatment. One approach is to develop real-time automatic detection systems using electroencephalogram (EEG) signals [3].
The main features of seizure-active EEG signals are spike waves and sharp waves. To distinguish spike waves and sharp waves from normal EEG waves, many signal processing methods have been proposed in recent years. Alharthi, M.K. et al. focused on EEG seizure onset detection using a combination of the discrete wavelet transform (DWT) and a deep learning model consisting of a 1D convolutional neural network (CNN) with bidirectional long short-term memory (Bi-LSTM) [4]. They achieved 96.87 % accuracy, 96.85 % sensitivity, and 96.98 % precision in their experiment. Zarei, A. et al. explored orthogonal matching pursuit (OMP) combined with a support vector machine (SVM) classifier for detecting seizure onsets [5]. Their study reported a sensitivity of 96.81 % and a false positive (FP) rate of 2.74 %. Bhattacharyya, A. et al. proposed the tunable-Q wavelet transform (TQWT) to decompose EEG epilepsy signals [6]. They then calculated entropy measures from the decomposed signals, and their approach achieved an accuracy of 98.6 %. Li, C. et al. proposed the use of common spatial pattern (CSP) to select relevant EEG channels for seizure onset detection [7]. They combined CSP with empirical mode decomposition (EMD) and an SVM model and obtained a sensitivity of 97.34 % and a false positive rate of 2.5 %. Oweis, R.J. et al. utilized the Hilbert-Huang transform (HHT) in the frequency domain for seizure detection [8]. Their method achieved an accuracy of 94 % and a specificity of 96 %. Hu, W. et al. used the mean amplitude spectrum (MAP) combined with a CNN model to classify seizure-active and seizure-free data [9]. Their approach reported a classification accuracy of 86.25 %. Bomela, W. et al. developed a complex brain connection method for real-time seizure detection [10]. They reported a sensitivity of 93.6 % and a false positive rate of 0.16 per hour. Shayeste, H. et al. developed a short-time Fourier transform (STFT) algorithm based on the bagging technique and a decision tree model for automatic seizure detection [11]. Their approach achieved high accuracy, sensitivity, and specificity, with reported values of 99.56 %, 99.52 %, and 99.62 %, respectively. Amiri, M. et al. utilized sparse CSP combined with an adaptive STFT-based synchrosqueezing transform for automatic seizure detection [12]. Their method achieved a sensitivity of 98.44 %, a specificity of 99.19 %, and an accuracy of 98.81 %. In our previous work, DWT and a RUSBoosted tree ensemble were combined to detect EEG epilepsy seizure onset in a real-time application, achieving 96.15 % sensitivity, 96.38 % accuracy, a 3.24 % FP rate, and a 10.42-s delay [13]. Furthermore, TQWT and a CNN model were also applied in our seizure detection work, achieving 97.57 % accuracy, 98.90 % sensitivity, a 2.13 % FP rate, and a 10.46-s delay [14].
To address the robustness issue in EEG-based epilepsy detection, researchers have also developed machine learning and deep learning methods. Omidvar, M. et al. proposed the use of an SVM model to classify EEG signals decomposed at the 5th level using the 5db DWT [15]. They reported an accuracy of 98.7 %. Donos, C. et al. employed the random forest algorithm to detect early seizures using intracranial EEG data [16]. Their method obtained a sensitivity of 93.84 %. Gao, Y. et al. focused on deep learning and utilized a deep CNN to classify seizure activity in EEG data [17]. Their approach achieved an average classification accuracy of 90 % in epilepsy seizure detection. Cao, X. et al. used LSTM networks to directly detect seizure onset [18]. They reported an accuracy of 96.3 % in their experiment. Wang, X. et al. proposed a stacked 1D-CNN model for automatic seizure onset detection [19]. Their approach obtained an accuracy of 88.14 % and a false positive rate of 0.38 %.
Combining signal processing and image classification techniques using CNN models has shown promising results in EEG research. Chen, H. et al. utilized the mutual information (MI) algorithm to calculate brain graph data and combined it with a graph CNN model for detecting subjects with attention-deficit/hyperactivity disorder (ADHD) using EEG signals [20]. They obtained an accuracy of 94.67 % on the test data. Ozcan, A.R. et al. employed a 3D-CNN model to classify features extracted from EEG signals, including statistical parameters and band power spectrum, in the context of seizure prediction. Their method achieved a sensitivity of 85.7 % and a false positive rate of 0.096 per hour [21]. In our previous work, a 3D-CNN classifier was proposed to classify EEG alcoholic brain connectivity data and achieved an accuracy of 96.25 ± 3.11 % [22]. This kind of method was also employed in our previous research [23], where the 3D-CNN method yielded results of 97.74 ± 1.15 % accuracy, 96.91 ± 2.76 % sensitivity, and 98.53 ± 1.97 %.
In this study, a band-pass filter using a 6th-order Butterworth zero-phase algorithm is applied to denoise the raw EEG data within the frequency range of 1-60 Hz. To extract features from the EEG signals, STFT spectra provide a time-frequency representation of the data. The obtained spectra are then transformed into image-like (graph) data, which serve as the input for the Google-Net CNN models. To implement the approach in real time, a sliding window technique with a duration of 1.35 s and a 1-s overlap is utilized. The experiments of this study were conducted on a Dell workstation equipped with an Intel i9-10900K CPU, 64 GB of memory, and an Nvidia 2080 Ti GPU. MATLAB 2021b, along with the Deep Network Designer toolbox, was used for the deep learning work and model development.
In this paper, Section 1 briefly introduces the background and research problems. Section 2 describes the methodology, which includes signal processing, feature extraction, and CNN model classification. Section 3 reports the results of the proposed method. Section 4 discusses the statistical analysis in time-frequency spectrum analysis, brain rhythm selection, and the evaluation of different CNN models. Moreover, previous works on the CHB-MIT database are also listed and evaluated in Section 4. Section 5 concludes the paper.
In Table 1, 'NS' is the number of seizures, 'SP' is simple partial seizures, 'CP' is complex partial seizures, and 'GTC' is generalized tonic-clonic seizures.
M. Shen et al.

Material
The CHB-MIT database, collected by Boston Children's Hospital, consists of EEG data from 23 subjects [24]. The database includes 5 males ranging in age from 3 to 22 years and 17 females ranging in age from 1.5 to 19 years. The EEG data were recorded using scalp EEG caps following the standard 10-20 system, with a sampling rate of 256 Hz, from 22 bipolar channels. Six specific channels, P3-O1, FP2-F8, P8-O2, P7-T7, T7-FT9, and FT10-T8, were used in this experiment. These electrodes are positioned close to the frontal, temporal, and occipital regions, aligning with the seizure onset zones of the selected patients as outlined in Table 1. For this study, a subset of the CHB-MIT database consisting of 16 patients was selected. Patients who had seizures characterized by amplitude depression were excluded from the analysis. Closer examination of the EEG signals of patients with high detection delay showed that their seizures often began with amplitude depression before the onset of synchronization, which is characterized by high-amplitude oscillations [10]. Because the algorithm cannot detect amplitude depression, significant delays were observed in these cases. Details of the selected subjects are given in Table 1.

Methodology
Fig. 1 illustrates the pipeline from EEG data acquisition through the main seizure detection algorithm to the generation of detection alarms. To reduce computational cost, six specific channels are selected for analysis: P3-O1, FP2-F8, P8-O2, P7-T7, T7-FT9, and FT10-T8. To enable real-time application, a sliding window technique is employed. The sliding window has a size of about 1.35 s, which corresponds to 345 samples of EEG data at the 256 Hz sampling rate. This approach allows continuous analysis of the EEG signals by processing them in overlapping segments. The raw EEG data are band-pass filtered using a 6th-order Butterworth filter, which removes unwanted noise and artifacts and retains the frequency components within the range of interest. Time-frequency analysis is then performed on the filtered EEG data to extract the frequency bands of interest. This analysis shows how the frequency content of the EEG signals changes over time, capturing transient characteristics such as those seen during seizure activity. The resulting time-frequency spectra are converted into an image-like matrix with dimensions of 120 × 344, which serves as the input for the Google-net CNN model.
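The sliding-window segmentation described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes that the 1-s overlap stated in the Introduction corresponds to 256 samples, so that each new window advances by 345 - 256 = 89 samples. The function and constant names are hypothetical.

```python
FS = 256          # sampling rate in Hz (from the CHB-MIT description)
WINDOW = 345      # samples per window (about 1.35 s, as stated in the text)
OVERLAP = 256     # assumed: the 1-s overlap expressed in samples


def sliding_windows(n_samples, window=WINDOW, overlap=OVERLAP):
    """Yield (start, stop) sample indices of consecutive overlapping windows."""
    step = window - overlap  # new samples consumed per window
    if step <= 0:
        raise ValueError("overlap must be smaller than the window")
    start = 0
    while start + window <= n_samples:
        yield start, start + window
        start += step


# Ten seconds of signal: list the first few windows and the step in seconds.
windows = list(sliding_windows(10 * FS))
step_seconds = (WINDOW - OVERLAP) / FS
print(windows[:3], len(windows), round(step_seconds, 3))
```

Under these assumptions a new window becomes available roughly every 0.35 s, which is the time budget any per-window processing has to fit into.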

Pre-processing
In this study, a 6th-order Butterworth filter is applied as a band-pass filter to the raw EEG data. Frequencies outside the range of interest, from 1 to 60 Hz, are attenuated, which reduces the impact of noise and artifacts on the EEG signals. The filtered EEG signals therefore primarily contain frequency components within the specified range, which is important for the subsequent analysis and feature extraction steps of the proposed method.
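As a brief aside on the zero-phase variant of this filter mentioned in the Introduction: zero-phase filtering is commonly implemented by running a causal filter forward and then backward over the signal. The relation below is a general property of such forward-backward filtering, not a detail taken from this study. Here H is assumed to be the frequency response of the single-pass, real-coefficient Butterworth band-pass filter, and X and Y are the spectra of the input and output signals.

```latex
Y(e^{j\omega}) = H(e^{j\omega})\, H^{*}(e^{j\omega})\, X(e^{j\omega})
               = \left| H(e^{j\omega}) \right|^{2} X(e^{j\omega})
```

The effective response is real and non-negative, so no phase distortion is introduced, while the attenuation in decibels is doubled relative to a single pass. Whether the stated 6th order refers to the single-pass filter or to the combined response is not specified in the text.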

Short-time Fourier transform
In this paper, the STFT is used as a time-frequency analysis technique to decompose the EEG signal into different frequency subbands and components. Specifically, the STFT is used to construct the time-frequency spectrum of the EEG signal within the frequency range of 20-60 Hz. This frequency range is of interest in this study because it contains information relevant to the detection of epileptic seizures. The STFT represents the signal in both the time and frequency domains by applying a series of Fourier transforms to overlapping, windowed segments of the signal, so that each time-varying EEG fragment is mapped onto time and frequency axes. The formulas of the Fourier transform and the STFT are shown in equations (1) and (2).
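For reference, the standard textbook forms of the continuous Fourier transform and the STFT are given below, written with the symbols used in this section (x(t) for the signal, ω for frequency, g(t-u) for the window shifted to time u). They are assumed to correspond to equations (1) and (2); the names F and STFT on the left-hand sides are chosen here for illustration.

```latex
F(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt \tag{1}

\mathrm{STFT}(u, \omega) = \int_{-\infty}^{\infty} x(t)\, g(t-u)\, e^{-j\omega t}\, dt \tag{2}
```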
where ω is the selected frequency band and g(t-u) is the window function. Here, the window is a Hamming window of 2 samples, and the number of overlapped samples is 1. The input data from the K domains are denoted by , and the epoch with the time index t is given as x(t).
Fig. 1. The framework of seizure detection via STFT spectrum analysis and Google-net CNN model.
The frame width used in the STFT is set to 128 samples. Consequently, the frequency axis of the STFT power spectrum is 0:2:128, and the time axis is 1:1:128. The frequency resolution of the STFT is chosen as 2 Hz. After the STFT analysis, each single-channel EEG window is transformed into a 20 × 344 image-like matrix, as illustrated in Fig. 2, where (a) is the STFT spectrum of seizure-free data and (b) is the STFT spectrum of seizure-active data. After the STFT spectra of the six selected channels are extracted, they are merged: the spectra from the channels are stacked to form a single input matrix of size 120 × 344 (6 × 20 rows) per epoch.
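The dimensions quoted above can be checked with a small pure-Python sketch. It is an illustration under stated assumptions rather than the authors' MATLAB code: it uses the 2-sample Hamming window with 1-sample overlap (which yields 344 frames from a 345-sample window), and it assumes that the 20 rows are the bins 20, 22, ..., 58 Hz, since the exact bins are not listed in the text. The input channels are synthetic sinusoids, not EEG data, and all function names are hypothetical.

```python
import cmath
import math

FS = 256        # sampling rate in Hz
WIN = 2         # window length in samples (as stated for the Hamming window)
HOP = 1         # hop size: window length minus the 1-sample overlap
FREQS = list(range(20, 60, 2))  # assumed bins: 20, 22, ..., 58 Hz (20 rows)


def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def stft_power(x, win=WIN, hop=HOP, freqs=FREQS, fs=FS):
    """Return power[f][k]: squared magnitude at freqs[f] for frame k."""
    g = hamming(win)
    n_frames = (len(x) - win) // hop + 1
    power = [[0.0] * n_frames for _ in freqs]
    for k in range(n_frames):
        frame = [x[k * hop + i] * g[i] for i in range(win)]
        for f_idx, f in enumerate(freqs):
            acc = sum(v * cmath.exp(-2j * math.pi * f * i / fs)
                      for i, v in enumerate(frame))
            power[f_idx][k] = abs(acc) ** 2
    return power


def stack_channels(spectra):
    """Concatenate per-channel spectra row-wise into one input matrix."""
    return [row for spec in spectra for row in spec]


# Six synthetic channels of 345 samples each (placeholders, not EEG data).
channels = [[math.sin(2 * math.pi * (20 + 5 * c) * n / FS) for n in range(345)]
            for c in range(6)]
matrix = stack_channels([stft_power(ch) for ch in channels])
print(len(matrix), len(matrix[0]))  # prints 120 344
```

Each channel contributes 20 rows and 344 columns, so six channels give the 120 × 344 matrix, that is, 41,280 values per window.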

Classification through deep learning method
Based on the size of the input data produced by the STFT analysis, a 29-layer Google-Net CNN was constructed, as shown in Fig. 3.

Leave-one-out training method
In this study, the leave-one-out training method is employed to evaluate the performance of the proposed method. This approach uses one set of data as the test set and the remaining data for training. In this experiment, a total of 16 models are trained, each on a different combination of training and test data. For training, the raw EEG data from 10 min before seizure onset and 5 min after seizure onset are used for each subject.
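The split logic can be sketched as below, assuming that each held-out set corresponds to one of the 16 selected patients, which matches the 16 trained models. The patient identifiers and the function name are hypothetical.

```python
def leave_one_out_splits(subjects):
    """Yield (train_subjects, test_subject) with each subject held out once."""
    for i, held_out in enumerate(subjects):
        train = subjects[:i] + subjects[i + 1:]
        yield train, held_out


# Hypothetical identifiers for the 16 selected patients.
subjects = [f"patient_{k:02d}" for k in range(1, 17)]
splits = list(leave_one_out_splits(subjects))
print(len(splits), splits[0][1], len(splits[0][0]))
```

Each of the 16 folds trains on 15 patients and tests on the remaining one, so no patient's data appear in both the training and the test set of the same model.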

Google-net convolutional neural network
In the proposed Google-net CNN model, the image-like data with dimensions of 120 × 344 are used as input. The model is trained using a learning rate of 0.01 and 30 epochs. The Google-Net CNN model consists of three individual convolution layers and a total of fifty-four convolutions in nine inception modules. All convolutions in the model use the rectified linear unit (ReLU) activation function along with batch normalization. The three individual convolution layers have sixty-four filters each, with convolution kernels of size 5 × 3, 5 × 3, and 1 × 1, respectively. The nine inception modules are designed similarly, and the detailed architecture of an inception module is depicted in Fig. 4.
In addition, the Google-Net CNN architecture includes a total of six pooling layers: five max pooling layers and one average pooling layer. The six pooling layers have sizes of 1 × 3, 3 × 3, 3 × 3, 3 × 3, 3 × 3, and 7 × 7 and strides of 1 × 3, 2 × 2, 2 × 2, 2 × 2, 2 × 2, and 1 × 1, respectively. To reduce overfitting in the CNN model, a 40 % dropout layer is included in the architecture. Finally, the last two layers of the CNN model are a fully connected layer and a Softmax classifier layer, respectively. Table 2 in the paper provides a summary of the Google-net architecture, including the

Results and comparison
To evaluate the proposed real-time seizure onset detection method, four main metrics are used in this study: accuracy, sensitivity, FP rate, and the delay of seizure onset detection.
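For reference, the first three metrics are conventionally defined as below, where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. The accuracy and sensitivity expressions are the standard definitions and are assumed to match equations (3) and (4). The FP rate expression is the usual per-window definition and is an assumption, since its formula is not given in the text.

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}

\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{4}

\mathrm{FP\ rate} = \frac{FP}{FP + TN}
```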
Accuracy measures how well the proposed method correctly identifies seizure onset events and non-seizure events. Its formula is given in equation (3), where 'TP', 'TN', 'FP', and 'FN' denote true positives, true negatives, false positives, and false negatives. Sensitivity, also known as recall or true positive rate, measures the ability of the algorithm to correctly identify seizure events or seizure onsets. In active seizure detection, the goal is to accurately detect the occurrence of seizure activity in real-time EEG signals. Sensitivity quantifies the proportion of actual seizure events that are correctly detected by the algorithm and is defined in equation (4). The 'delay' of seizure onset refers to the time gap between the actual start of a seizure and its detection by the method, and it gauges the precision of the detection algorithm in terms of time. Typically, this 'delay' is computed by measuring the time interval from the start of the seizure to the identification of its onset. The 'delay' for the first seizure onset of Patient 3 is shown in Fig. 6.
Here, the doctor marked this onset from 361 s to 413 s, and our proposed method can detect this seizure beginning at 367 s.Thus, the 'delay' of this seizure is 6 s.
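The delay computation in this example can be written as a short sketch. The function name and the list of detection times are hypothetical; only the onset (361 s), the end (413 s), and the first detection (367 s) come from the example above.

```python
def detection_delay(onset_s, detection_times_s, seizure_end_s):
    """Return seconds from marked onset to first detection inside the seizure.

    Returns None when no detection falls between onset and end.
    """
    hits = [t for t in detection_times_s if onset_s <= t <= seizure_end_s]
    if not hits:
        return None
    return min(hits) - onset_s


# Patient 3 example: marked 361 s to 413 s, first detection at 367 s.
# The later detection times are illustrative placeholders.
delay = detection_delay(361, [367, 368, 369], 413)
print(delay)  # prints 6
```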

Results of the proposed method
In the real-time application on the CHB-MIT database, 41,280 feature values, one per entry of the 120 × 344 matrix, are used as input. The Google-net CNN model is evaluated using the leave-one-out training method. As reported in Table 4, the method achieved 97.74 % accuracy, 98.90 % sensitivity, a 1.94 % false positive rate, and a 9.85-s delay.

Comparison with different frequency bands
Identifying the frequency band that yields the best results for real-time epilepsy seizure detection helps to reduce computing costs. Six frequency bands are considered in this experiment: the δ band (1-4 Hz), θ band (4-8 Hz), α band (8-12 Hz), β band (12-30 Hz), γ band (30-60 Hz), and the selected frequency band (20-60 Hz). The comparison is described in Table 5.
According to Table 5, the selected frequency band has been verified as the best frequency band for detecting EEG epilepsy seizure signals.

Comparison with different deep learning models
Two other CNN models, VGG-net and Squeeze-net, are compared with the proposed Google-net CNN model on the testing data. In this comparison, the same input data and the same validation method are used for EEG epilepsy signal detection, and the results are compared with those of the proposed method. The input data for the Google-Net CNN model was a 120 × 344 matrix of image-like data, while for the VGG-net CNN and Squeeze-net CNN models, the first three layers of the Google-Net CNN model were used to represent a 224 × 224 × 3 input. The results of these three deep learning methods are summarized in Table 6.
Table 6 indicates that the Google-Net CNN model outperformed the VGG-net CNN and Squeeze-net CNN models in real-time EEG epilepsy seizure onset detection. The Google-Net CNN model, with its complex architecture and multiple layers, appears to capture the relevant patterns and features in the EEG data for seizure detection more effectively.

Time-frequency domain analysis
The continuous wavelet transform (CWT) is another time-frequency method that describes how the frequency content of a signal changes over time. Moreover, CWT is more flexible in terms of the choice of wavelet function and the ability to adapt the analysis to different frequency bands or signal characteristics. This allows for better customization and optimization based on the specific requirements of the application. In this study, the CWT spectrum was also tested for EEG seizure onset detection, with the frequency resolution likewise set to 2 Hz. The CWT spectra of the seizure-free and seizure-active states are shown in Fig. 7 (a) and (b), and the results are described in Table 7.
It is evident from Table 7 that the CWT spectrum with the Google-net CNN model struggles with the case 'Patient 09', for which it achieved only 46.91 % accuracy and a 53.08 % FP rate. The STFT therefore provides better performance for this task.

Real-time application
In real-time applications, it is essential to process the data within a limited time frame to provide timely and actionable results. If the calculation time exceeds the time step between consecutive windows (the window length minus the overlap), the detector falls behind the incoming data and becomes impractical for real-time use. Therefore, selecting an appropriate window size is crucial to ensure that the computational requirements of the method align with the desired real-time performance. Moreover, the delay is an important consideration when detecting EEG seizure onset. A larger window size can result in a significant delay, that is, the time it takes for the detection algorithm to identify the onset of a seizure after it occurs. A large delay can reduce the clinical relevance of the detection method, as timely intervention or response may be compromised. In this study, a 1.35-s sliding window is selected to balance detection performance against delay in real-time applications.
In our previous work [13,14], the eigenvalue calculation step increased the overall calculation time. Using the STFT spectrum directly as input to the Google-Net CNN model proved to be an effective way to minimize processing time and enable real-time EEG seizure detection. The processing time of the proposed method is just 0.02 s per EEG window.
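A simple way to check the real-time condition is to compare the processing time with the time step between consecutive windows. The sketch below uses the figures reported in this paper (1.35-s window, 1-s overlap, 0.02-s processing time); the function name is hypothetical, and the check ignores data acquisition and transfer overheads.

```python
def realtime_margin(window_s, overlap_s, processing_s):
    """Return spare seconds per window; negative means the detector falls behind."""
    step_s = window_s - overlap_s  # time between the starts of consecutive windows
    return step_s - processing_s


margin = realtime_margin(window_s=1.35, overlap_s=1.0, processing_s=0.02)
print(round(margin, 2))
```

With these values the margin is positive, about 0.33 s per window, so the processing keeps pace with the incoming windows.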

Previous works comparison
Comparison results with related works in EEG epilepsy seizure onset detection are listed in Table 8. The proposed method achieves 97.74 % accuracy, 98.90 % sensitivity, a 1.94 % false positive rate, and a 9.85-s delay on the testing data. Compared with the previous related work, the proposed method achieves satisfactory detection results on the CHB-MIT database. Moreover, the proposed method is efficient in terms of processing time, requiring just 0.02 s for every 1.35-s EEG episode. This indicates that the method is computationally efficient and capable of performing real-time seizure detection with minimal delay.
However, the proposed method currently cannot detect seizures characterized by amplitude depression. Addressing this limitation represents a key area for future research in EEG seizure onset detection. In addition, the experiment used a frequency resolution of 2 Hz instead of 1 Hz because of limitations in the CPU memory capacity of the workstation. A higher frequency resolution could potentially provide more detailed information and improve the accuracy of the seizure detection. The selection of CNN models was also limited by GPU memory constraints, so the study could not include CNN models such as Efficient-Net CNN and ResNet-50 CNN. These models are known for their effectiveness in various computer vision tasks and might have provided additional insights or improved the performance of the proposed method.

Conclusion
This study proposes an EEG-based real-time epilepsy seizure detection approach that combines signal processing techniques with deep learning methods, specifically the time-frequency spectrum and Google-Net CNN models. The approach starts by applying the STFT to extract signal features and remove redundant information, which helps to improve the robustness of epilepsy detection using EEG signals. The study then employs the Google-Net CNN model, designed specifically for image-like data, and compares its performance with the Squeeze-net and VGG-net CNN models. The evaluation results demonstrate that the Google-Net CNN model achieves better performance in classifying the image-like data. Additionally, the STFT is found to be superior to the CWT in terms of reducing the false positive rate. The proposed real-time seizure detection method achieved 97.74 % accuracy, 98.90 % sensitivity, a 1.94 % false positive rate, and a 9.85-s delay on the CHB-MIT database when using the STFT spectrum. Based on these findings, the study concludes that the proposed method is suitable for real-time seizure detection and holds potential for clinical applications. Future work includes testing the seizure prediction aspect of the method in clinical applications using portable EEG devices and brainwave monitors, and detecting seizures characterized by amplitude depression.

Fig. 2. (a) STFT spectrum of seizure-free data, (b) STFT spectrum of seizure-active data. The seizure-active data are taken from case 'Chb01_03' from 3009 s to 3011 s, and the seizure-free data are taken from the same case from 2800 s to 2802 s.

Fig. 4. Architecture of an inception in Google-Net CNN model.

Fig. 5. The training progress for the 'Patient 1' case via STFT analysis and Google-net CNN in MATLAB 2021b.

Fig. 6. Detection of the first seizure onset of Patient 3 by the proposed method.


Fig. 7. (a) CWT spectrum of seizure-free data, (b) CWT spectrum of seizure-active data. The seizure-active data are taken from case 'Chb01_03' from 3009 s to 3011 s, and the seizure-free data are taken from the same case from 2800 s to 2802 s.

Table 1
Information about the CHB-MIT database used in this study.

Table 2
The Google-Net CNN architecture used in EEG seizure detection.

Table 3
The validation accuracy of STFT spectrum with Google-net training models.

Table 4
Real-time detection using the STFT and Google-net method.

Table 5
The comparison between different frequency bands.

Table 6
Results of the two compared deep learning methods and the proposed method.

Table 7
Real-time detection using the CWT and Google-net method.

Table 8
Comparison of the related works in EEG epilepsy seizure detection.